The AT&T WATSON Speech Recognizer

Authors

Vincent Goffin

Cyril Allauzen

Enrico Bocchieri

Dilek Z. Hakkani-Tür

Andrej Ljolje

Sarangarajan Parthasarathy

Mazin G. Rahim

Giuseppe Riccardi

Murat Saraclar

Abstract

This paper describes the AT&T WATSON real-time speech recognizer, the product of several decades of research at AT&T. The recognizer handles a wide range of vocabulary sizes and is based on continuous-density hidden Markov models for acoustic modeling and finite state networks for language modeling. The recognition network is optimized for efficient search. We identify the algorithms used for high-accuracy, real-time, and low-latency recognition. We present results for small and large vocabulary tasks taken from the AT&T VoiceTone® service, showing a word accuracy improvement of about 5% absolute and a real-time processing speed-up by a factor of 2 to 3.

Similar resources

A binaural microscopic model based on the modulation filter bank for predicting speech intelligibility in normal-hearing listeners

This study introduces a binaural microscopic model, based on the modulation filter bank, for predicting speech intelligibility. So far, binaural models have relied on spectral criteria such as the STI and SII, or on other analytical methods, to determine binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...

ALGONQUIN: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition

One approach to robust speech recognition is to use a simple speech model to remove the distortion, before applying the speech recognizer. Previous attempts at this approach have relied on unimodal or point estimates of the noise for each utterance. In challenging acoustic environments, e.g., an airport, the spectrum of the noise changes rapidly during an utterance, making a point estimate a po...

Unsupervised training of an HMM-based self-organizing unit recognizer with applications to topic classification and keyword discovery

We present our approach to unsupervised training of speech recognizers. Our approach iteratively adjusts sound units that are optimized for the acoustic domain of interest. We thus enable the use of speech recognizers for applications in speech domains where transcriptions do not exist. The resulting recognizer is a state-of-the-art recognizer on the optimized units. Specifically we propose buildi...

An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation

This paper proposes an effective feature compensation scheme to address a real-life situation where a clean speech database is not available for Gaussian Mixture Model (GMM) training for a model-based feature compensation method. The proposed scheme employs a Support Vector Machine (SVM)-based model selection method to effectively generate the GMM for our feature compensation method directly from ...

Rapid Match Training for Large Vocabularies

This paper describes a new algorithm for building rapid match models for use in Dragon's continuous speech recognizer. Rather than working from a single representative token for each word, the new procedure works directly from a set of trained hidden Markov models. By simulated traversals of the HMMs, we generate a collection of sample tokens for each word which are then averaged together to b...

Publication date: 2005
